Abstract

An upper bound to the information capacity of a wavelength-division multiplexed
optical fiber communication system is derived in a model incorporating the
nonlinear propagation effects of cross-phase modulation (XPM). This work builds
on the paper by Mitra et al., which finds lower bounds to the channel capacity by
using physical models for propagation to calculate statistical properties of
the conditional probability distribution relating input and output in a single WDM
channel. In this paper we present a tractable channel model incorporating the
effects of cross-phase modulation. Using this model we find an upper bound to the
information capacity of the fiber optical communication channel at high SNR. The
results provide physical insight into the manner in which nonlinearities degrade the
information capacity.

1 Introduction 

Communication via optical fibers has received considerable attention recently, mainly due
to the extremely high bandwidths and unique propagation environments they provide.
These characteristics make the fiber an appealing medium for multichannel communication,
known as Wavelength Division Multiplexing (WDM), which has
been the focus of much recent research and practical consideration [?].
However, optical fibers exhibit a number of nonlinear electromagnetic phenomena which
affect the propagation of signals. As we increase the rate of communication through the
fiber, these nonlinear phenomena limit the reliability of communication.
From an information theoretic point of view, these phenomena
limit the information capacity of the fiber optical communication channel. In order to
derive this limit we need to study and model the effects of such phenomena on the inputs
to the channel as they propagate through the fiber. This was first done by Mitra et
al. [?].

In this paper we shall introduce the nonlinear effects in a fiber. We shall model the
effect of one of them, known as Cross Phase Modulation (XPM), on the
propagation of signals through the fiber and derive an upper bound on the capacity of
the optical channel considering the effects of XPM.

In Section 2 we shall describe the propagation of an electromagnetic wave through
the fiber, where our signal is used to modulate the wave. In Section 3 we shall use our
propagation equation to find a relation between the input and output of a fiber, viewed
as the communication channel. This is done by considering the Green's function that
relates the input and output of the fiber medium. We shall then simplify the Green's
function in Section 4. This simplification results in a tractable channel model. We find
an upper bound to the information capacity of the simplified channel model in the high
SNR regime in Section 5.

2 Signal Propagation in Optical Fibers 

In this section we shall describe the propagation of an input signal in an optical fiber,
based on Mitra et al. [?] and Agrawal [3], which shall be used to derive a communication
theoretic channel model in the next sections.

In a single mode fiber used for Wavelength Division Multiplexed (WDM) transmission,
there are a number of channels $N$, each used by an independent user. User $k$ uses
an input signal $x_k(t)$ to modulate the carrier wave of frequency $\nu_k$ and has a bandwidth
of $B \ll \nu_k$. We denote the carrier frequency of the central channel by $\nu_0$, and
the carrier frequency spacing of neighboring channels by $\delta\nu$. Since the users
are independent, the input signals to different channels, $x_k(0,t)$, are also independent,
each having a power constraint $E(|x_k(0,t)|^2) \le P$, $\forall k$. Along the fiber, $x_k(z,t)$ is the
amplitude of the electric field in channel $k$ with propagation constant $\beta_k$, where $z$ is the
distance travelled along the fiber. The total electric field resulting from the signals in all
channels is given by

\[
E(z,t) = \sum_{k=-N/2}^{N/2} \left[ x_k(z,t)\exp\big(i(\beta_k z - 2\pi\nu_k t)\big) + x_k^*(z,t)\exp\big(-i(\beta_k z - 2\pi\nu_k t)\big) \right] \exp(-\alpha z/2). \tag{2.1}
\]

Equation (2.1) accounts for the power loss due to absorption in the fiber through the
exponential decay factor $\exp(-\alpha z/2)$. We also take the direction of polarization to be
fixed and the transverse profile of the mode to be independent of $z$. The propagation
constants $\beta_k$ are frequency dependent, making propagation through the fiber dispersive.
This phenomenon, known as Group Velocity Dispersion (GVD), causes different
frequency components of a signal to travel at different speeds, distorting
the input signal. It also causes signals in different channels to travel at different speeds.

In an optical fiber, different frequency components interact (couple) to generate new
frequency components; this phenomenon is known as Four Wave Mixing (FWM). Also,
due to the dependence of the refractive index of the fiber on the power of the propagating
signals, different channels interact. The interaction of a channel with itself is known as
Self Phase Modulation (SPM) and the interaction of different channels with each other
is known as Cross Phase Modulation (XPM). For further details refer to [?] and [?].

Given the phenomena explained above, the signal in channel $k$, $x_k(z,t)$ (written in
the frame of reference that moves along the fiber with the group velocity of the central
channel), propagates through the fiber according to the nonlinear Schrödinger equation,

\[
\left[\partial_z + \frac{i}{2}\beta_2\partial_t^2\right] x_k(z,t) = i\gamma\Big[|x_k|^2 + 2\sum_{l\neq k}|x_l|^2\Big]\exp(-\alpha z)\,x_k(z,t). \tag{2.2}
\]

Propagation along the fiber is thus characterized by a set of $N+1$ coupled nonlinear
partial differential equations. In Equation (2.2), $\beta_k(\nu)$ has been Taylor expanded about
$\nu_0$ to take the form

\[
\beta_k(\nu) = \beta(\nu_0 + k\delta\nu) = \beta_0 + \beta_1 k\delta\nu + \beta_2 k^2\delta\nu^2/2 + O(\delta\nu^3).
\]

We have also neglected the effect of Four Wave Mixing (FWM), since it can be shown [?]
that the effect of FWM on the information capacity can be studied separately. As we
shall see, SPM cannot be decoupled from the input signal, so its effect on the capacity
requires a different set of techniques, which can be studied separately and will not be
considered here [?]. However, XPM involves signals from other channels, which are chosen
independently, so for each signal the effect of the other channels is essentially random.
Hence, in what follows we study the stochastic effect of XPM on the
information capacity.
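The second-order Taylor expansion of the propagation constant quoted above can be evaluated directly. The following is a minimal sketch, not taken from the paper: the function name `beta_k` and the numerical values are hypothetical, chosen only to show that the odd (group-delay) term cancels between channels $\pm k$ while the even (dispersion) term adds.

```python
def beta_k(k, beta0, beta1, beta2, delta_nu):
    """Second-order Taylor expansion of the propagation constant of channel k.

    beta0, beta1, beta2 are the expansion coefficients about the central
    carrier frequency and delta_nu is the channel spacing (illustrative units).
    """
    offset = k * delta_nu
    return beta0 + beta1 * offset + 0.5 * beta2 * offset ** 2


# Hypothetical coefficients, for illustration only.
b0, b1, b2, dnu = 1.0, 2.0, 4.0, 0.5

# The linear term cancels in the symmetric combination of channels +1 and -1,
# leaving only the dispersion contribution beta2 * delta_nu**2.
symmetric_part = beta_k(1, b0, b1, b2, dnu) + beta_k(-1, b0, b1, b2, dnu) - 2 * b0
```

The symmetric combination isolates the $\beta_2$ term, which is the coefficient that reappears in the dispersive operator of the propagation equation.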

After neglecting SPM and FWM, the effective channel model is given by $N+1$
coupled nonlinear PDEs, known as the equations of motion. The equation of motion for
the central channel is given by

\[
\left[\partial_z + \frac{i}{2}\beta_2\partial_t^2\right] x_0(z,t) = iV_0(z,t)\,x_0(z,t), \tag{2.3}
\]
\[
V_0(z,t) = 2\gamma\sum_{l\neq 0}|x_l(z,t)|^2\exp(-\alpha z), \tag{2.4}
\]

where $V_0(z,t)$ represents the XPM term for the central channel. Equation (2.3) is
still nonlinear because $V_0(z,t)$ couples different channels to the central channel. Note
that, apart from the interchange of spatial and temporal coordinates, $V_0(z,t)$ enters
Equation (2.3) in the same way as, and is therefore mathematically equivalent to, a
potential entering the Schrödinger equation [?].

2.1 Equations of Motion 

The equations of motion are modeled by decoupling and linearizing Equations (2.3) and
(2.4). As mentioned before, the XPM term is mathematically equivalent to a potential,
$\nu(z,t)$. This potential is modeled as a Gaussian random process in both space and time,
independent of all the channel inputs, with the same first and second order moments
as the XPM term, see Equations (2.6) and (2.7). This model follows from the central
limit theorem, considering the fact that at each instant in time and space, $\sum_{l\neq 0}|x_l(z,t)|^2$
is the sum of many independent random variables, each with a finite second moment.
Hence, the XPM term converges in distribution to a Gaussian random process [?]. Note
that while the signal in each channel has both spatial and temporal correlation, we are
adding many such signals which travel at different speeds in space, hence the correlations
are lost. For further details see [?].
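The central-limit argument above can be illustrated numerically. The sketch below is not part of the paper: it assumes, purely for illustration, that each interfering channel contributes an independent, exponentially distributed instantaneous power $|x_l|^2$ with mean $P$ (as would hold for a complex Gaussian amplitude), and measures how the skewness of the summed power shrinks as the number of channels grows. All function names are hypothetical.

```python
import random


def summed_power_samples(num_channels, mean_power, num_samples, seed=0):
    """Draw samples of sum_l |x_l|^2 for independent exponential powers (an assumption)."""
    rng = random.Random(seed)
    rate = 1.0 / mean_power
    return [
        sum(rng.expovariate(rate) for _ in range(num_channels))
        for _ in range(num_samples)
    ]


def skewness(samples):
    """Sample skewness; it is zero for a Gaussian distribution."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    std = var ** 0.5
    return sum(((s - mean) / std) ** 3 for s in samples) / n


# Two interfering channels versus one hundred, same per-channel mean power.
skew_few = skewness(summed_power_samples(2, 1.0, 5000))
skew_many = skewness(summed_power_samples(100, 1.0, 5000))
```

Under this assumed input distribution the sum follows a Gamma law whose skewness decays like $2/\sqrt{n}$ in the number of channels $n$, so `skew_many` should be much closer to the Gaussian value of zero than `skew_few`.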

Note that $\nu(z,t)$ does not depend on the signals in other channels. The resulting
model for propagation is linear, since the equations for different channels have been
decoupled. Hence, the effective equation of motion for the central channel is given by

\[
\left[i\partial_z - \frac{\beta_2}{2}\partial_t^2\right] x_0(z,t) = \nu(z,t)\,x_0(z,t), \tag{2.5}
\]
\[
E(\nu(z,t)) = E(V_0(z,t)), \quad \forall t,\ \forall z,\ 0 < z < L, \tag{2.6}
\]
\[
E(\nu(z_1,t_1)\nu(z_2,t_2)) = E(V_0(z_1,t_1)V_0(z_2,t_2)), \quad \forall t_1,t_2,\ \forall z_1,z_2,\ 0 < z_1,z_2 < L, \tag{2.7}
\]

where the expectations of the $V_0$ terms are taken over the joint distribution of the inputs to
all channels.



3 Channel Model 

In this section we obtain a stochastic channel model based on the equation of motion
obtained in Section 2. This stochastic model is then simplified into a channel model
in which the effect of XPM appears as a multiplicative phase noise term.

3.1 Effects of XPM 

Equation (2.5) describes the effect of XPM on the relation between the input and output
signal of the central channel. Based on this equation, the channel model, or input-output
relation, can be described using the Green's function $G(L;t,t')$, also known as the
propagator [?],

\[
y(t) = x_0(L;t) = \int G(L;t,t')\,x_0(0;t')\,dt' + n(t), \tag{3.1}
\]

where $L$ is the length of the fiber and $n(t) = \int_0^L n(z,t)\,dz$ takes into account the effect of
all additive noise, which we shall approximate by an additive white Gaussian random
process, independent of all inputs to the channel.

The Green's function is a function of the random potential $\nu(z,t)$ and can be written
in the form of a path integral [?],

\[
G(L;t,t') = \int \mathcal{D}t(z)\,\exp\left[-\frac{i}{2\beta_2}\int_0^L (\partial_z t(z))^2\,dz - i\int_0^L \nu(t(z),z)\,dz\right], \tag{3.2}
\]

where the paths $t(z)$ start at $t'$, at $z = 0$, and end at $t$, at $z = L$. This expression is
encountered in the context of Quantum Mechanics, where the roles of time and space are
interchanged. We shall simplify this expression to obtain a channel model.

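Theorem 1. Simplifying Equation (3.2) and combining it with Equation (3.1) results in the
following simplified channel model,

\[
y(t) = \int \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right)\exp\big(-iLU(L;t',t)\big)\,x(t')\,dt' + n(t), \tag{3.3}
\]

where

\[
U(L;t',t) \sim \mathcal{N}\big(0, \sigma_\nu^2(t-t')\big). \tag{3.4}
\]

Here $\sigma_\nu^2$ is a constant [its closed-form expression, which includes a sum with upper
limit $N/2$, is illegible in the source], where $P$ is the power in each channel, $\beta_2$ is the
propagation constant of the central channel, $\delta\nu$ is the channel frequency spacing and
$N$ is the number of channels.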



Proof. The rest of Section 3 is devoted to proving Theorem 1. This is done by simplifying
and approximating the Green's function given by Equation (3.2).

The potential can be written as $\nu(z,t) = E(\nu(z,t)) + \delta\nu(z,t)$. The average value of
the random potential causes a deterministic constant phase factor, which has no effect
on the information capacity of the channel. However, the fluctuations of the random
potential about its average value affect the phase of the Green's function and hence the
capacity of the channel. Since the average potential has no effect on our results, we shall
assume $E(\nu(z,t)) = 0$, hence $\nu(z,t) = \delta\nu(z,t)$.



3.2 "Green's Function" Approximation 

In this section we shall approximate the Green's function given by Equation (3.2) using
stochastic and physical properties of the random potential.

Referring to [?], we can approximate the expression for $G(L;t,t')$ by dividing both
space and time into small intervals. We divide the time interval $(t',t)$ into $M$ equal
subintervals $\Delta t$, resulting in $\{t_i\}_{i=0}^{M}$, where $t_0 = t'$ and $t_M = t$. We also divide the
fiber length into $M$ equal subintervals $\Delta L$.

In all that follows we shall be concerned with the time interval $(t',t)$, i.e. $\nu(z,t'') = 0$
if $t'' \notin (t',t)$. However, we shall assume that the inputs start in the distant past and
continue into the distant future. This assumption is justified since the signals in different
channels are not necessarily synchronized. And so we have [?],



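\[
\int \mathcal{D}t(z) \;\to\; \int\cdots\int dt_{M-1}\,dt_{M-2}\cdots dt_1,
\]
\[
\frac{1}{2\beta_2}\int_0^L (\partial_z t(z))^2\,dz \approx \frac{1}{2\beta_2}\sum_{k=1}^{M}\left(\frac{t_k - t_{k-1}}{\Delta L}\right)^2\Delta L,
\]
\[
\int_0^L \nu(t(z),z)\,dz \approx \sum_{k=1}^{M}\nu(t_k)\,\Delta L.
\]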



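These approximations result in

\[
\begin{aligned}
G(L;t,t') &= \int \mathcal{D}t(z)\,\exp\left[-\frac{i}{2\beta_2}\int_0^L (\partial_z t(z))^2\,dz - i\int_0^L \nu(t(z),z)\,dz\right] \\
&\approx \int\cdots\int \exp\left[-\frac{i}{2\beta_2}\sum_{k=1}^{M}\left(\frac{t_k - t_{k-1}}{\Delta L}\right)^2\Delta L - i\sum_{k=1}^{M}\nu(t_k)\,\Delta L\right] dt_{M-1}\,dt_{M-2}\cdots dt_1 \\
&= \int\cdots\int \prod_{k=1}^{M}\exp\left[-\frac{i}{2\beta_2}\frac{(t_k - t_{k-1})^2}{\Delta L}\right]\exp\big(-i\nu(t_k)\Delta L\big)\,dt_{M-1}\,dt_{M-2}\cdots dt_1.
\end{aligned}
\tag{3.5}
\]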



Consider the terms containing the random potential in Equation (3.5), given by

\[
\exp\big(-i\nu(t_k)\Delta L\big) = 1 - i\nu(t_k)\Delta L - \tfrac{1}{2}\big(\nu(t_k)\Delta L\big)^2 + \cdots = 1 - i\nu(t_k)\Delta L + O\big((\Delta L)^2\big),
\]

which can be approximated by neglecting the $O((\Delta L)^2)$ terms. This approximation is
justified considering nominal values for the parameters, see [?]. Hence,
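The size of the neglected term can be checked numerically. The following sketch is an illustration only (the values of the potential and of the step are arbitrary, not nominal fiber parameters): it compares the exact phase factor with its first-order truncation and confirms that the error scales quadratically with the step.

```python
import cmath


def first_order_error(nu, delta_l):
    """Magnitude of exp(-i*nu*delta_l) minus its first-order Taylor truncation."""
    exact = cmath.exp(-1j * nu * delta_l)
    truncated = 1 - 1j * nu * delta_l
    return abs(exact - truncated)


# Arbitrary illustrative values: halving the step should quarter the error.
err_coarse = first_order_error(nu=1.0, delta_l=1e-2)
err_fine = first_order_error(nu=1.0, delta_l=5e-3)
error_ratio = err_coarse / err_fine
```

For small arguments the error is approximately $(\nu\,\Delta L)^2/2$, so `error_ratio` comes out close to 4.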



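\[
G(L;t,t') \approx \int\cdots\int \prod_{k=1}^{M}\exp\left[-\frac{i}{2\beta_2}\frac{(t_k - t_{k-1})^2}{\Delta L}\right]\big(1 - i\nu(t_k)\Delta L\big)\,dt_{M-1}\,dt_{M-2}\cdots dt_1.
\]

Evaluating these integrations results in

\[
\begin{aligned}
G(L;t,t') &\approx \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right) - iL\int_0^1\!\!\int_{-\infty}^{\infty}\exp\left(-\frac{i}{2\beta_2 L}\left[\frac{(t-t_a)^2}{a} + \frac{(t_a-t')^2}{1-a}\right]\right)\nu(t_a)\,dt_a\,da \\
&= \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right)\big[1 - iLU(L;t',t)\big].
\end{aligned}
\tag{3.6}
\]

Equation (3.6) consists of two terms: the first term is a deterministic phase shift, while
the second term involves the weighted integration of the random potential $\nu(t_a)$, resulting
in the random process $U(L;t',t)$. We need to characterize the distribution of the random
process $U(L;t',t)$.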

3.3 Distribution of $U(L;t',t)$

In this section we shall characterize the distribution of $U(L;t',t)$. This in turn
characterizes the distribution of our approximation to the Green's function.

Lemma 1. The distribution of the random process $U(L;t',t)$ is given by

\[
U(L;t',t) \sim \mathcal{N}\big(0, \sigma_\nu^2(t-t')\big), \tag{3.7}
\]

with $\sigma_\nu^2$ given by [expression illegible in the source].

Proof. An outline of the proof of Lemma 1 is given in the appendix. □

In our derivation of the distribution of the Green's function we neglected the higher
order terms in $\Delta L$. If we follow the same tedious calculations for higher orders of $\Delta L$ we
get the following approximation for the Green's function,

\[
\begin{aligned}
G(L;t,t') &\approx \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right)\left(1 - iLU(L;t',t) + \frac{(-iL)^2 U^2(L;t',t)}{2!} + \cdots\right) \\
&= \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right)\exp\big(-iLU(L;t',t)\big),
\end{aligned}
\tag{3.8}
\]

where the distribution of $U(L;t',t)$ is given by Equation (3.7). The proof of
Equation (3.8) follows from the proof of Lemma 1 in the appendix and is therefore not presented
here. Combining Equation (3.8) with Equation (3.1) results in the following approximate
channel model,

\[
y(t) = \int \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right)\exp\big(-iLU(L;t',t)\big)\,x(t')\,dt' + n(t),
\]

which completes the proof of Theorem 1. □



4 Simplified Channel Model 



In this section we use some physical characteristics of an optical fiber to further simplify
the channel model given by Theorem 1. Equation (3.3) is a standard representation of
the following input-output relation,

\[
x_0(L;t) = \int \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right)\exp\big(-iLU(L;t',t)\big)\,x_0(0;t')\,dt' + n(t). \tag{4.1}
\]

As shown by Equation (4.1), the output at time $t$ is affected by the input at time $t'$. The
propagation (travel) time of the signal in the central channel is given by $L/v_g$, where $v_g$ is
the group velocity of the central channel, see [?], and $L$ is the fiber length. Hence, the
input time interval $t'$ that affects the output at time $t$ is a window in time around $t - L/v_g$.
Propagation along the fiber is dispersive, i.e. signals tend to spread as they propagate
along the fiber. However, the propagation
distances used in practice are small enough to ensure that the signal spread is small
compared to the propagation delay $L/v_g$. Hence the input time interval $t'$ that affects the
output at time $t$ is small compared to $L/v_g$.

As an example, assume that the signals are bit streams of time duration $T \approx 20$ ps.
Nominal values for the fiber parameters are: group velocity $v_g = 200{,}000$ km/s,
propagation constant $\beta_2 = 20$ ps$^2$/km and fiber length $L = 50$ km. In most practical situations
no more than $m = \pm 100$ neighboring bits will affect each bit. Hence, we have $mT \approx 4$ ns
(200 bit periods of 20 ps each) and $L/v_g \approx 0.25$ ms, so the time interval that affects the
output at each time is much smaller than the propagation time of the signal. Hence,

\[
t - t' \approx \frac{L}{v_g}. \tag{4.2}
\]
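The separation of time scales behind approximation (4.2) is simple arithmetic and can be verified with the nominal values quoted in the example. The sketch below assumes that "$m = \pm 100$" means 100 bits on each side of the current bit, i.e. a window of 200 bit periods; that reading, and the variable names, are assumptions.

```python
PICOSECOND = 1e-12

bit_duration_s = 20 * PICOSECOND      # T, bit duration from the example
neighbors_each_side = 100             # m = +/-100 (assumed: 100 bits per side)
fiber_length_km = 50.0                # L
group_velocity_km_per_s = 200_000.0   # v_g

# Width of the input window that influences one output instant.
window_s = 2 * neighbors_each_side * bit_duration_s

# Propagation delay of the central channel over the full fiber length.
delay_s = fiber_length_km / group_velocity_km_per_s

# Ratio of the two time scales; a small value supports t - t' ~ L / v_g.
window_to_delay = window_s / delay_s
```

With these numbers the window is about 4 ns against a delay of 0.25 ms, a ratio of order $10^{-5}$, which is why the dependence of $U$ on $t - t'$ can be replaced by the constant $L/v_g$.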

Based on this approximation, $U(L;t',t)$ is a stationary random process in time, hence
we drop the time dependence in $U(L;t',t)$. We also absorb the factor $L$ that precedes
$U(L)$ in the channel model into its variance, resulting in

\[
U(L) \sim \mathcal{N}\left(0, \sigma_\nu^2\,\frac{L}{v_g}\,L^2\right).
\]

The result of the approximation in (4.2) is to "lump" the effect of a "continuously injected
multiplicative noise" $U(L;t',t)$ along the fiber into a lumped multiplicative noise $U(L)$,
in the form of a random phase shift, at the end of the fiber length $L$. The resulting
channel model is given by

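\[
\begin{aligned}
y(t) &= \int \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right)\exp\big(-iU(L)\big)\,x(t')\,dt' + n(t) \\
&= \exp\big(-iU(L)\big)\int \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right)x(t')\,dt' + n(t).
\end{aligned}
\]

Note that $\int \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right)x(t')\,dt'$ is the convolution of the input signal $x(t)$ with
$\exp\left(-i\frac{t^2}{2\beta_2 L}\right)$, which results in a deterministic phase shift in the frequency domain. This
phase shift can be compensated for, so it has no effect on the information capacity of the
channel or on the capacity achieving input signal. As a result the input signal can be represented as

\[
\tilde{x}(t) = \int \exp\left(-i\frac{(t-t')^2}{2\beta_2 L}\right)x(t')\,dt'.
\]

Finally, dropping the tilde, we have the following simplified channel model,

\[
y(t) = \exp\big(-iU(L)\big)\,x(t) + n(t), \tag{4.3}
\]

where $U(L) \sim \mathcal{N}\left(0, \sigma_\nu^2\,\frac{L}{v_g}\,L^2\right)$.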
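A lumped phase-noise channel of this form, output equals input rotated by a random Gaussian phase plus additive noise, is easy to simulate. The sketch below is illustrative only: the function name, the per-dimension noise parameter, and the numerical values are hypothetical, and one independent phase is drawn per channel use. With the additive noise switched off, it shows that the phase noise changes the argument of the output but never its magnitude.

```python
import cmath
import random


def phase_noise_channel(x, phase_std, noise_std_per_dim, rng):
    """One use of y = exp(-i*u) * x + n with Gaussian phase u and complex Gaussian n."""
    u = rng.gauss(0.0, phase_std)
    n = complex(rng.gauss(0.0, noise_std_per_dim), rng.gauss(0.0, noise_std_per_dim))
    return cmath.exp(-1j * u) * x + n


rng = random.Random(1)
x_in = 1 + 1j

# Additive noise disabled to isolate the effect of the multiplicative phase term.
outputs = [
    phase_noise_channel(x_in, phase_std=0.3, noise_std_per_dim=0.0, rng=rng)
    for _ in range(1000)
]

# Largest deviation of the output magnitude from the input magnitude.
max_amplitude_deviation = max(abs(abs(y) - abs(x_in)) for y in outputs)

# Number of distinct output phases observed (rounded), showing the phase is random.
distinct_phases = len({round(cmath.phase(y), 6) for y in outputs})
```

Because the magnitude survives the phase noise untouched, the amplitude of the input remains a usable information carrier at any phase-noise strength.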

5 Channel Capacity

In this section we provide an upper bound to the information capacity of the channel
given by Equation (4.3) in the high SNR regime.

Consider the channel given by Equation (4.3). Since each channel is bandlimited, the
inputs to each channel are also bandlimited; hence our continuous-time channel model is
equivalent to a discrete-time channel model, see [11],

\[
y_k = \exp(-iu_k)\,x_k + n_k, \tag{5.1}
\]

where $u$ and $n$ are independent i.i.d. sequences with $u_k \sim \mathcal{N}\left(0, \sigma_\nu^2\,\frac{L}{v_g}\,L^2\right)$ and
$n_k \sim \mathcal{N}(0, \sigma_n^2)$.

Consider the channel given by Equation (5.1). The input to the channel is given by
$\{x_k\}$ with $x_k \in \mathbb{C}$, with an input power constraint $E\left(\frac{1}{n}\sum_{k=1}^{n}|x_k|^2\right) \le P$. The additive
noise $\{n_k\}$ is an i.i.d. sequence of circularly symmetric complex Gaussian random variables
with $n_k \sim \mathcal{N}(0, \sigma_n^2)$. The phase noise $\{u_k\}$ is an i.i.d. sequence of Gaussian
random variables with $u_k \sim \mathcal{N}(0, \sigma_u^2)$, with the distribution of $\{u_k\}$ being independent
of the input power. Also, $u_k$ has finite entropy. Note that all three processes $\{x_k\}$, $\{u_k\}$
and $\{n_k\}$ are mutually independent. Given this setting we have the following lemma.

Lemma 2. Given the above setting, the capacity of the channel given by Equation (5.1)
is upper bounded by

\[
C \le \frac{1}{2}\log\left(1 + 2\pi^2\,2^{-2h(u)}\,\frac{P}{\sigma_n^2}\right) + o(1)
\]

in the high SNR regime, where $h(u)$ is the differential entropy of the phase noise and the
$o(1)$ term tends to zero as the SNR tends to infinity.

The proof of Lemma 2 follows from the proof in [?]. As an example, consider the
following nominal values for a typical WDM optical fiber: $\beta_2 = 20$ ps$^2$/km, $\gamma = 1.2$ (W km)$^{-1}$,
$\delta\nu = 50$ GHz, $v_g = 200{,}000$ km/s, $L = 50$ km, and $N = 100$, which result in the capacity
being upper bounded by [expression illegible in the source; the constant 0.0118 appears in it].
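To see how a bound of this type behaves, the sketch below evaluates it for Gaussian phase noise. Several assumptions are made and should be read as such: the bound is taken to have the form $C \le \frac{1}{2}\log_2\left(1 + 2\pi^2 2^{-2h(u)} P/\sigma_n^2\right)$ with the $o(1)$ term dropped, logarithms are base 2, and the phase-noise entropy is the Gaussian differential entropy $h(u) = \frac{1}{2}\log_2(2\pi e\,\sigma_u^2)$ (reasonable only when $\sigma_u$ is small enough that phase wrapping is negligible). The function names and test values are hypothetical.

```python
import math


def gaussian_entropy_bits(variance):
    """Differential entropy, in bits, of a real Gaussian with the given variance."""
    return 0.5 * math.log2(2 * math.pi * math.e * variance)


def capacity_upper_bound_bits(snr, phase_variance):
    """Assumed high-SNR bound 0.5*log2(1 + 2*pi^2 * 2^(-2h(u)) * snr), o(1) term dropped."""
    h_u = gaussian_entropy_bits(phase_variance)
    return 0.5 * math.log2(1 + 2 * math.pi ** 2 * 2 ** (-2 * h_u) * snr)


# Illustrative values only: SNR of 100 and two phase-noise variances.
bound_low_phase_noise = capacity_upper_bound_bits(100.0, 0.01)
bound_high_phase_noise = capacity_upper_bound_bits(100.0, 0.02)
```

For Gaussian phase noise $2^{-2h(u)} = 1/(2\pi e\,\sigma_u^2)$, so the argument of the logarithm reduces to $1 + (\pi/e)\,\mathrm{SNR}/\sigma_u^2$: the bound falls as the phase-noise variance grows, which is the mechanism by which XPM degrades capacity in this model.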

6 Conclusion 

An upper bound to the information capacity of a wavelength-division multiplexed optical
fiber communication system is derived in a model incorporating the nonlinear propagation
effects of cross-phase modulation (XPM). We modeled the effects of the continuously
injected phase noise due to the cross-phase modulation nonlinearity as a lumped
multiplicative phase noise at the end of the fiber. This model leads to an upper bound to the
capacity of a WDM fiber optical communication channel in the high SNR regime. This
upper bound is in agreement with the lower bound derived by Mitra et al. [?].

Future directions include better models for the effect of various fiber nonlinearities,
including XPM. Upper bounds for the low SNR regime and tighter upper bounds for
the high SNR regime could also be derived, as we believe our upper bound can be improved.
7 Appendix

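Outline of proof of Lemma 1

Carrying out the tedious calculations,

\[
\begin{aligned}
U(L;t',t) &= \exp\left(i\frac{(t-t')^2}{2\beta_2 L}\right)\int_0^1\!\!\int_{-\infty}^{\infty}\exp\left(-\frac{i}{2\beta_2 L}\left[\frac{(t-t_a)^2}{a} + \frac{(t_a-t')^2}{1-a}\right]\right)\nu(t_a)\,dt_a\,da \\
&= \int_0^1 U_a(L;t',t)\,da.
\end{aligned}
\]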

As mentioned in Section 3.2, $\nu(t_a)$ is zero outside the time interval $(t',t)$, hence

\[
\begin{aligned}
U_a(L;t',t) &= \int_{t'}^{t}\exp\left(-\frac{i\,[t_a - t(1-a) - t'a]^2}{2\beta_2 L a(1-a)}\right)\nu(t_a)\,dt_a \\
&= \int_{-(1-a)(t-t')}^{a(t-t')}\exp\left(-\frac{i\,t_a^2}{2\beta_2 L a(1-a)}\right)\nu\big(t_a + t(1-a) + t'a\big)\,dt_a \\
&\overset{d}{=} \int_{-(1-a)(t-t')}^{a(t-t')}\exp\left(-\frac{i\,t_a^2}{2\beta_2 L a(1-a)}\right)\nu(t_a)\,dt_a,
\end{aligned}
\]

where $\overset{d}{=}$ equates the distribution of both sides, which follows from the stationarity of
$\nu(t)$ and the fact that both $t_a + t(1-a) + t'a$ and $t_a$ cover the integration interval
$(-(1-a)(t-t'),\ a(t-t'))$, for any $a \in (0,1)$. Note that we are only interested in the
distribution of $U_a(L;t',t)$.
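The change of variable above rests on completing the square in the exponent. Assuming the two-term exponent has the form $(t-t_a)^2/a + (t_a-t')^2/(1-a)$ (a reconstruction inferred from the shifted argument $t_a + t(1-a) + t'a$, not something legible in the source), the identity and the shifted integration limits can be checked numerically. The function names and the test point are hypothetical.

```python
def two_term_exponent(t, t_prime, t_a, a):
    """Assumed exponent before completing the square."""
    return (t - t_a) ** 2 / a + (t_a - t_prime) ** 2 / (1 - a)


def completed_square(t, t_prime, t_a, a):
    """Same quantity written as a constant plus a square centred at t(1-a) + t'a."""
    center = t * (1 - a) + t_prime * a
    return (t - t_prime) ** 2 + (t_a - center) ** 2 / (a * (1 - a))


# Arbitrary test point with t' < t_a < t and 0 < a < 1.
t, t_prime, t_a, a = 3.0, 1.0, 1.7, 0.3

lhs = two_term_exponent(t, t_prime, t_a, a)
rhs = completed_square(t, t_prime, t_a, a)

# Shifting t_a by the centre maps the interval (t', t) onto
# (-(1-a)(t-t'), a(t-t')), the limits used in the appendix.
center = t * (1 - a) + t_prime * a
shifted_lower = t_prime - center
shifted_upper = t - center
```

The constant term $(t-t')^2$ is the deterministic phase that factors out of the Green's function, while the remaining square depends on $t_a$ only through its distance from the centre.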

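Since $\nu(t)$ is a Gaussian process, $U_a(L;t',t)$ is also a Gaussian random
process [7], i.e. $U_a(L;t',t) \sim \mathcal{N}(\mu, \sigma)$, where it can be shown that

\[
\mu = E\left[\int_{-(1-a)(t-t')}^{a(t-t')}\exp\left(-\frac{i\,t_a^2}{2\beta_2 L a(1-a)}\right)\nu(t_a)\,dt_a\right] = 0,
\]
\[
\sigma = E\left[\int \exp\left(-\frac{i\,t_1^2}{2\beta_2 L a(1-a)}\right)\nu(t_1)\,dt_1 \left(\int \exp\left(-\frac{i\,t_2^2}{2\beta_2 L a(1-a)}\right)\nu(t_2)\,dt_2\right)^{*}\right] = \sigma_\nu^2(t-t').
\]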



Hence $U_a(L;t',t) \sim \mathcal{N}\big(0, \sigma_\nu^2(t-t')\big)$, where $\sigma_\nu^2 = E[\nu(t)^2]$ is the power of the Gaussian
process $\nu(t)$, given by [expression illegible in the source]; for details of this derivation see [?].

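Note that $U_a(L;t',t)$ is a jointly Gaussian random process in $a$. Hence, integrating
$U_a(L;t',t)$ over $a$ results in a Gaussian random process [7], i.e. $U(L;t',t) \sim \mathcal{N}(\mu_U, \sigma_U)$, where
it can be shown that

\[
\mu_U = E\left[\int_0^1 U_a(L;t',t)\,da\right] = 0,
\]
\[
\sigma_U = E\left[\int_0^1 U_a(L;t',t)\,da \int_0^1 U_{a'}^{*}(L;t',t)\,da'\right] = \sigma_\nu^2(t-t').
\]

And so,

\[
U(L;t',t) \sim \mathcal{N}\big(0, \sigma_\nu^2(t-t')\big).
\]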